Conditional Topic Random Fields

Authors

  • Jun Zhu
  • Eric P. Xing
Abstract

Generative topic models such as LDA are limited by their inability to utilize nontrivial input features to enhance their performance, and many topic models assume that topic assignments of different words are conditionally independent. Some work exists to address the second limitation, but none addresses both. This paper presents a conditional topic random field (CTRF) model, which can use arbitrary nonlocal features about words and documents and incorporate the Markov dependency between topic assignments of neighboring words. We develop an efficient variational inference algorithm that scales linearly in the number of topics, and a maximum likelihood estimation (MLE) procedure for parameter estimation. For the supervised version of CTRF, we also develop an arguably more discriminative max-margin learning method. We evaluate CTRF on real review rating data and demonstrate the advantages of CTRF over generative competitors, and we show the advantages of max-margin learning over MLE.

Similar resources

Conditional Random Fields for Airborne Lidar Point Cloud Classification in Urban Area

Over the past decades, urban growth has been known as a worldwide phenomenon that includes widening process and expanding pattern. While the cities are changing rapidly, their quantitative analysis as well as decision making in urban planning can benefit from two-dimensional (2D) and three-dimensional (3D) digital models. The recent developments in imaging and non-imaging sensor technologies, s...

A Conversational Movie Search System Based on Conditional Random Fields

Online streaming companies such as Netflix have become dominant in the media distribution sector. However, such media delivery services often support very rudimentary search, especially for natural language queries. To provide a more natural search interface, we have developed a conversational movie search system, which parses the recognition hypothesis of a spoken query into semantic classes u...

Predictive Random Fields: Latent Variable Models Fit by Multiway Conditional Probability with Applications to Document Analysis

We introduce predictive random fields, a framework for learning undirected graphical models based not on joint, generative likelihood, or on conditional likelihood, but based on a product of several conditional likelihoods each relying on common sets of parameters and predicting different subsets of variables conditioned on other subsets. When applied to models with latent variables, such as th...

WHU-BioNLP CHEMDNER System with Mixed Conditional Random Fields and Word Clustering

Our team participated in the Chemical Compound and Drug Name Recognition task of BioCreative IV. We used mixed conditional random fields with word clustering to fulfill this task. On the one hand, we generate the word feature by word clustering and train the corpus with the word feature to get one model. On the other hand, the training corpus is transformed to a new one in the reversed order of ...

Identification of chemical and gene mentions in patent texts using feature-rich conditional random fields

This article describes the application of Neji, a text-processing and concept recognition framework, to the automatic recognition of chemicals and gene mentions in medicinal chemistry patents. We used conditional random fields models trained with an optimized set of features including linguistic, orthographic, morphological, dictionary matching and local context features, dictionary-matching, and...

Chemical name recognition with harmonized feature-rich conditional random fields

This article presents a machine learning-based solution for automatic chemical and drug name recognition on scientific documents, which was applied in the BioCreative IV CHEMDNER task, namely in the chemical entity mention recognition (CEM) and the chemical document indexing (CDI) sub-tasks. The proposed approach applies conditional random fields with a rich feature set, including linguistic, o...

Publication date: 2010